Abstract

In this work, we investigate the optimal dynamic packet scheduling policy in a
wireless relay network (WRN). We model this network by two sets of parallel queues, 
that represent the subscriber stations (SS) and the relay stations (RS), with random link 
connectivity. An optimal policy minimizes, in the stochastic ordering sense, a cost process that
is a function of the SS and RS queue sizes. We prove that, in a system with symmetrical
connectivity and arrival distributions, a policy that tries to balance the lengths of all 
the system queues, at every time slot, is optimal. We use stochastic dominance and 
coupling arguments in our proof. We also provide a low-overhead algorithm for optimal 
policy implementation. 

Keywords: Optimal scheduling, wireless relay network, cooperative diversity, coupling argu- 
ments, stochastic ordering, MB policies. 

1 Introduction 

Fourth generation (4G) wireless systems are high-speed cellular networks with peak download
data rates of 100 Mbps. The IEEE 802.16j task group recommended the use of relay nodes and
cooperation in 4G network design [1]. In these systems, dedicated wireless relay nodes are
deployed in order to achieve cooperative diversity. Wireless relays are spread over the coverage
range of the cell. They increase coverage within a cell and help deliver the
targeted data rate to 4G mobile users. These relays usually possess limited functionality and
have low power consumption. Consequently, they are significantly cheaper than a full-scale
base station. Early studies of cooperative communication were initiated by [2], [3] and [4].
Since then, the subject has attracted the attention of many researchers, cf. [5], [6].

Most of the existing work in this area aimed at exploiting the diversity and multiplexing 
gain to improve some performance criteria, e.g., capacity and bandwidth utilization, outage 
probability, error rate, etc. These are often achieved through the use of adaptive modulation 
techniques, distributed space-time coding, or error-correction coding. In this work, we study 
this problem from a different perspective, the dynamic packet scheduling perspective. We 
are interested in the scheduling of packets on the uplink of a wireless relay network (WRN). 
Each of the subscriber or the relaying nodes is assumed to have a time-varying channel that 
can be modelled as a random process. We present a queueing model that captures the packet 
buffering, scheduling and routing processes as well as the intermittent channel connectivity in 
such network. We then use this model to study dynamic packet scheduling in such networks. 

Dynamic packet scheduling enables the redistribution of the available resources to improve 
network performance. Furthermore, optimal packet scheduling policies can be determined un- 
der various operating constraints to optimize various performance criteria. This motivated 
the investigation of the optimal control problem that we present here. The inherent random- 
ness of the wireless channel and the dynamic configuration of the nodes in wireless networks 
create a formidable challenge to such investigation. Therefore, it is wise and often neces- 
sary in such cases to make simplifying assumptions that result in mathematically tractable 
problem formulations. Otherwise, optimality results may not be attainable. 

In this article, we investigate an optimal dynamic packet scheduling policy in a wireless
relay network (WRN) composed of a base station (BS), L subscriber stations (SS) and K
relay stations (RS), for arbitrary L and K. This network is modelled by two sets of
queues with infinite size (see Figure 1). The wireless channel in such a network varies
with time and can best be described by a random process. This assumption is widely used
in the literature, cf. [11], [12] and many others. A wireless link is assumed to be 'connected'
with probability p and 'not connected' otherwise. We further assume that the connectivity
processes are independent across the wireless links. The transmission frame is divided into
two halves; during the first half a SS node is scheduled to transmit (to a selected RS) and
during the second half a RS node is scheduled to transmit to the base station. This model
can be used to study scheduling and routing algorithms in multi-hop wireless networks such
as relay-assisted, fourth generation wireless networks.

Figure 1: A queueing model for dynamic packet scheduling in wireless relay networks.

The optimization cost that we consider in this work is a monotone, non-decreasing function
of the system queue occupancy. We prove, using stochastic dominance [9] and coupling
arguments [10], that a most balancing (MB) policy minimizes, in the stochastic ordering sense¹,
the cost function random process.

1.1 Previous Work 

The problem we are investigating lies in the area of optimal control in queueing networks. 
A related problem was first studied by Rosberg et al. in [8]. They investigated an optimal
control policy of service rate in a system of two queues in tandem. They proved, using
a dynamic programming argument, that a threshold policy is optimal in that it minimizes the
expected cost function of the queue occupancy.

¹Stochastic dominance is a stronger optimization notion than expected cost minimization since the
former implies the latter; however, the reverse is not true [?].

Another work related to the optimal control problem we present here was reported
by Tassiulas and Ephremides in [11]. They considered a model of parallel queues with a
single server; they showed that an LCQ policy, a policy that allocates the server to its longest
connected queue, minimizes the total number of packets in the system. The authors in [12]
studied a satellite node with K transmitters. They modelled the system by a set of parallel
queues with symmetrical statistics competing for K identical servers. At any given time slot
in this model: (i) a server is connected to either all or none of the queues, (ii) at most one server
can be allocated to each scheduled queue. Using stochastic coupling arguments, the authors
proved that an LCQ (Longest Connected Queue) policy is optimal.

In previous work [13], we studied the problem of optimal scheduling in a multi-server 
system of parallel queues with random queue-server connectivity. We relaxed assumptions 
(i) and (ii) in [12] above. We proved, using coupling arguments, that a Most Balancing (MB) 
policy is optimal in that it minimizes, in stochastic ordering sense, a cost function of the 
queue sizes in the system. 

In [14], the authors proposed a cooperative multiplexing and scheduling algorithm for a
wireless relay network with a single relay node. They showed that this algorithm outperforms
the traditional opportunistic techniques in terms of spectral efficiency. The authors in [18]
studied link scheduling in WRN with bandwidth and delay guarantees. They modelled the
system using a simple directed graph and proposed an efficient algorithm to provide delay
guarantees over WRN.

For a wireless relay network, the choices of relay node, relay strategy, and the allocation of
power and bandwidth for each user are important design parameters which were investigated
thoroughly in recent literature. Relay selection and cooperation strategies for relay networks
have been investigated by [16] and [17] among others. Power control has been investigated by
[15] and [19] and many others. However, modern wireless networks are mostly IP-based,
and therefore the optimization problem may be reduced to finding the optimal dynamic
packet scheduling policies in these networks.



1.2 Our Contributions 

In this work, we develop a queueing model to study the process of packet scheduling in
wireless relay networks. Our main contributions can be summarized as follows:

• We develop a queueing model to study packet scheduling in WRN, see Figure 1.

• We introduce (in Equation 15) and show the existence of the class of optimal scheduling
policies (i.e., the MB policies). We prove their optimality (Theorem 2) for packet
scheduling in this model. The optimality criterion we use is stochastic ordering and the
cost that is minimized belongs to a set of functionals of the SS and RS queue lengths.

• We provide an implementation algorithm for packet scheduling policies in WRN. We
also prove that this algorithm results in a MB policy.



The model we present in this article differs from the previous work (Section 1.1)
in that it contains two sets of parallel queues in tandem rather than just one. At every time
frame, the scheduler in this case must decide which SS node transmits during the first half of
the frame, which RS queue receives the transmitted packets and which RS node transmits (to
the BS) during the second half of the frame. Solving this problem is quite challenging.
The dependency between the two sets of queues (i.e., SS and RS), mainly the dependency of
the scheduling controls $U_2(t)$ on $U_1(t)$, and of $U_3(t)$ on $U_2(t)$ and $U_1(t)$, adds complexity
to the solution of this optimization problem. The approach we use in our proof (Section 4)
addresses this issue rigorously. The model and results we present here help provide a
sound theoretical ground for the problem of packet scheduling in WRN and can also be used
to study multi-hop wireless networks in general.

The rest of this article is organized as follows. In Section 2, we present a detailed description
of the queueing model under investigation. In Section 3, we define the class of "Most
Balancing" policies. We present the optimality results in Section 4. Conclusions are given
in Section 5. Proofs for some of our results are given in the appendix.

2 Model Description 

We model the WRN by a discrete-time queueing system as shown in Figure 1. The objective
is to find the optimal dynamic packet scheduling policy for this network. The optimal policy 
is the one that minimizes a cost function of the queue lengths (to be defined shortly). 

In this model, time is slotted into constant intervals, each of which is equal to one transmission
frame. At every time slot, the following sequence of events happens: (a) the system
state (queue sizes and connectivities) is observed, (b) a scheduler action (or control decision)
is selected, and (c) the exogenous arrivals are added to their respective SS queues. The
scheduler action involves (i) selecting a SS node to transmit to a RS node (denoted by $U_1(t)$),
(ii) selecting the RS node that the scheduled packet is routed to (denoted by $U_2(t)$), and (iii)
selecting a RS node to transmit to the base station (denoted by $U_3(t)$). These actions are
executed sequentially in the order $U_1(t)$, $U_2(t)$, then $U_3(t)$. A packet that arrives during the
current time slot can only be considered for transmission in subsequent time slots. We
assume that the scheduler has complete knowledge of the system state when the decision time
arrives. This is a realistic assumption for most infrastructure-based networks since they use
centralized control provided by the base station. They deploy a dedicated control channel
that can be used to communicate such information.

2.1 Formulation and Statistical Assumptions 

We define the following notation that we use to describe the model under investigation.
Throughout this paper, we use UPPER CASE, bold face and lower case letters to
represent random variables, vector/matrix quantities and sample values respectively. In our
notation, we define two dummy queues, one SS and one RS, that we denote by the index
'0'. These queues are used to represent the idling action, i.e., a dummy packet is removed
from the dummy queue when no real packet from a real queue is scheduled for transmission. We assume
that the dummy queues have full connectivity at all times and initial sizes of 0. The dummy
queues are required in order to facilitate the mathematical formulation of this optimal control
problem. Let $\mathcal{L} = \{0, 1, \ldots, L\}$ (respectively $\mathcal{K} = \{0, 1, \ldots, K\}$) be the set of indices for the
SS (respectively RS) stack of queues. For any time slot $t = 1, 2, \ldots$, we define the following:

• $\mathbf{X}(t) = (X_0(t), X_1(t), \ldots, X_L(t))$ is the queue length vector for SS nodes (measured
in number of packets) at the beginning of time slot t, where $X_i(t) \in \{0, 1, 2, \ldots\}$ and
$X_0(1) = 0$, i.e., we assume that the dummy queue is initially empty.

• $\mathbf{Y}(t) = (Y_0(t), Y_1(t), \ldots, Y_K(t))$ is the queue length vector for RS nodes at the beginning
of time slot t, where $Y_i(t) \in \{0, 1, 2, \ldots\}$. We assume that $Y_0(t) = 0$ for all t.

• $\mathbf{A}(t) = (A_0(t), A_1(t), \ldots, A_L(t))$, where $A_i(t)$ is the number of exogenous arrivals to SS
queue i during time slot t.

• $\mathbf{C}^s(t) = (C^s_0(t), C^s_1(t), \ldots, C^s_L(t))$ (resp. $\mathbf{C}^r(t) = (C^r_0(t), C^r_1(t), \ldots, C^r_K(t))$) is the channel
connectivity for SS (resp. RS) nodes during time slot t, where $C^s_0(t) = C^r_0(t) = 1$, $\forall t$.

• $\mathbf{U}(t) = (U_1(t), U_2(t), U_3(t))$, s.t. $U_1(t) \in \mathcal{L}$, $U_2(t), U_3(t) \in \mathcal{K}$, is the scheduler decision
(or control), where $\mathbf{U}(t) = (i, j, k)$ means that SS node i is scheduled to transmit to RS
node j during the first half of time slot t, and RS node k is scheduled to transmit to
the BS during the second half of time slot t.

For ease of reference, we refer to the queue lengths and connectivities, i.e., the
tuple $(\mathbf{X}(t), \mathbf{Y}(t), \mathbf{C}^s(t), \mathbf{C}^r(t))$, as the system "state" (denoted by $\mathbf{S}(t)$) at time slot t.

We make the following statistical assumptions regarding the random processes in the
system. The arrival processes $\{A_i(t), i = 1, \ldots, L\}$ are assumed to be i.i.d. Bernoulli² with
parameter q. However, the number of arrivals to any RS queue at time t is equal to the number of
packets transmitted from a SS node to that RS node during that time slot. For convenience,
we define $A_0(t) = W^s_0(t)$, where $W^s_0(t)$ is the number of packets withdrawn from queue 0
during time slot t, in order to ensure that $X_0(t) = 0$ for all t. Furthermore, transmitted
dummy packets (i.e., fictitious packets from dummy queues) are not added to the receiver
queue (the RS queue that the packet is routed to). This assumption is intuitively correct,
since a fictitious packet is generated only when there is no real packet transmission.

²This assumption is widely used in the literature for analytic studies and optimization of wireless networks
[11], [12] and [13].

The connectivity processes $C^s_i(t)$ and $C^r_j(t)$, for all $i = 1, \ldots, L$ and $j = 1, \ldots, K$, are
assumed to be independent 2-state channels with connection probability p. It is further
assumed that the connectivity and arrival processes are independent of each other.

Some of the statistical assumptions that we enforce are necessary for the tractability of
the solution for this problem. Others can be relaxed at the cost of more complexity. For
future work, we propose to relax some of these assumptions.

We define next the 'withdrawal' and the 'insertion' controls as a function of the scheduler
control $\mathbf{U}(t)$ in order to simplify problem formulation and the proof of our results.

2.2 Feasible Withdrawal/Insertion Control Vectors 

Let $\mathbb{1}_{\{B\}}$ denote the indicator function for condition B. At any given time slot t, we define
the SS (respectively the RS) withdrawal vector $\mathbf{W}^s(t)$ (respectively $\mathbf{W}^r(t)$) as follows:

$$W^s_i(t) = \mathbb{1}_{\{U_1(t) = i\}}, \quad \forall i \in \mathcal{L}, \text{ and} \tag{1}$$
$$W^r_j(t) = \mathbb{1}_{\{U_3(t) = j\}}, \quad \forall j \in \mathcal{K}, \tag{2}$$

where $W^s_i(t)$ (respectively $W^r_j(t)$) represents the number of packets withdrawn from SS queue
i (respectively RS queue j) during time slot t. We also define the RS insertion vector as:

$$V^r_j(t) = \mathbb{1}_{\{U_2(t) = j\}}, \quad j = 1, \ldots, K, \tag{3}$$

where $V^r_j(t)$ represents the number of packets inserted into RS queue j during time slot t. Note
that we do not allow real packets to be inserted in the dummy queue. Similarly, we do not
allow dummy packets to be inserted into real queues.



In the system described above and for any (feasible) withdrawal/insertion controls, the
queue length for any SS node evolves according to the following relation:

$$\mathbf{X}(t+1) = \mathbf{X}(t) - \mathbf{W}^s(t) + \mathbf{A}(t) \tag{4}$$

Similarly, the queue length evolution for any RS node is given by the following relation:

$$\mathbf{Y}(t+1) = \mathbf{Y}(t) + \mathbf{V}^r(t) - \mathbf{W}^r(t) \tag{5}$$
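The one-slot dynamics in Equations (1) - (5) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `step`, the list-based state, the convention that an insertion happens only when both the scheduled SS queue and the target RS queue are real (non-dummy), and the reset of the dummy RS queue to zero are assumptions made here, not part of the original formulation.

```python
def step(x, y, u, arrivals):
    """Advance the SS lengths x and RS lengths y by one time slot.

    Index 0 of each list is the dummy queue. u = (i, j, k) means SS queue i
    sends one packet to RS queue j, then RS queue k sends one packet to the BS.
    The caller is assumed to pass a feasible control u.
    """
    i, j, k = u
    n_ss, n_rs = len(x), len(y)
    w_s = [1 if n == i else 0 for n in range(n_ss)]   # Equation (1)
    w_r = [1 if n == k else 0 for n in range(n_rs)]   # Equation (2)
    # Equation (3); assumption: no insertion when either queue is a dummy.
    v_r = [1 if (n == j and i != 0 and j != 0) else 0 for n in range(n_rs)]
    a = list(arrivals)
    a[0] = w_s[0]                                     # A_0(t) = W_0^s(t) keeps X_0 at zero
    x_next = [x[n] - w_s[n] + a[n] for n in range(n_ss)]      # Equation (4)
    y_next = [y[n] + v_r[n] - w_r[n] for n in range(n_rs)]    # Equation (5)
    y_next[0] = 0                                     # assumption: Y_0(t) = 0 for all t
    return x_next, y_next
```

For instance, with `x = [0, 2, 1]`, `y = [0, 0, 3]`, control `(1, 1, 2)` and one arrival at SS queue 2, SS queue 1 loses a packet, RS queue 1 gains it, and RS queue 2 forwards one packet to the BS.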

It is apparent from Equations (4) and (5) that Equations (1) - (3) do not guarantee
feasibility of the withdrawal/insertion vectors and hence of the scheduling control vector $\mathbf{U}(t)$.
Therefore, we provide the following feasibility condition:

A vector $\mathbf{U}(t)$ is said to be a 'feasible scheduling control' if the following condition is
satisfied: 'a packet may only be withdrawn from a connected, non-empty queue'. Formally,
given the system state $\mathbf{S}(t)$ during time slot t, a scheduling control vector $\mathbf{U}(t)$ is feasible if and
only if the resulting withdrawal/insertion vectors satisfy the following feasibility constraints:

$$0 \le W^s_i(t) \le \mathbb{1}_{\{X_i(t) > 0\}} \cdot C^s_i(t), \quad \forall i \ne 0, \tag{6}$$
$$0 \le W^r_j(t) \le \mathbb{1}_{\{Y_j(t) > 0\}} \cdot C^r_j(t), \quad \forall j \ne 0, \tag{7}$$
$$\sum_{i=0}^{L} W^s_i(t) = 1, \qquad \sum_{j=0}^{K} W^r_j(t) = 1. \tag{8}$$

According to Constraint (6), a packet is withdrawn from a SS queue i only if queue i is
connected and non-empty, i.e., $X_i(t) > 0$ and $C^s_i(t) = 1$. Similarly, according to Constraint
(7), a packet can only be withdrawn from a connected, non-empty RS queue. Constraint (8)
ensures that only one SS node and one RS node are allowed to transmit at any given time t.
Let $\mathcal{U}(\mathbf{S}(t))$ be the set of all feasible scheduling controls when the system is in state $\mathbf{S}(t)$.
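A feasibility test along the lines of Constraints (6) - (8) might look as follows. The helper names `is_feasible` and `feasible_controls` are hypothetical. Two modelling choices are assumptions of this sketch rather than statements from the text: the RS queue is checked for non-emptiness after the first-half insertion (which is how Equation (18) later treats it, whereas Constraint (7) as printed refers to $Y_j(t)$), and a dummy SS packet is always paired with the dummy RS queue. Constraint (8) holds automatically because a control names exactly one SS index and one RS index.

```python
from itertools import product


def is_feasible(x, y, c_s, c_r, u):
    """Return True if control u = (i, j, k) is feasible in state (x, y, c_s, c_r)."""
    i, j, k = u
    # Assumption: dummy packets go only to the dummy queue, real packets only to real queues.
    if (i == 0) != (j == 0):
        return False
    # Constraint (6): a real SS queue must be connected and non-empty.
    if i != 0 and not (x[i] > 0 and c_s[i] == 1):
        return False
    # Assumption: RS non-emptiness is evaluated after the first-half insertion.
    y_after = list(y)
    if j != 0:
        y_after[j] += 1
    # Constraint (7): a real RS queue must be connected and non-empty.
    if k != 0 and not (y_after[k] > 0 and c_r[k] == 1):
        return False
    return True


def feasible_controls(x, y, c_s, c_r):
    """Enumerate every feasible control (i, j, k) by exhaustive search."""
    return [u for u in product(range(len(x)), range(len(y)), range(len(y)))
            if is_feasible(x, y, c_s, c_r, u)]
```

Exhaustive enumeration is only practical for small L and K, but it gives a reference set against which a scheduling rule can be checked.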

2.3 Policies for Dynamic Packet Scheduling 

A packet scheduling policy $\pi$ (or policy $\pi$ for short) is a rule that determines the feasible
control vectors $\mathbf{U}(t)$ for all t as a function of the past history and current state of the system,
where the state history $\mathbf{H}(t)$ is given by the following sequence of random variables:

$$\mathbf{H}(1) = (\mathbf{X}(1), \mathbf{Y}(1)), \text{ and for } t \ge 2:$$
$$\mathbf{H}(t) = (\mathbf{X}(1), \mathbf{Y}(1), \mathbf{C}^s(1), \mathbf{C}^r(1), \mathbf{A}(1), \ldots, \mathbf{C}^s(t-1), \mathbf{C}^r(t-1), \mathbf{A}(t-1), \mathbf{C}^s(t), \mathbf{C}^r(t)) \tag{9}$$

Let $\mathcal{H}_t$ be the set of all state histories up to time slot t. Then a policy $\pi$ can be formally
defined as the sequence of measurable functions

$$g_t : \mathcal{H}_t \to \mathbb{Z}_+^3, \quad \text{s.t. } g_t(\mathbf{H}(t)) \in \mathcal{U}(\mathbf{S}(t)), \quad t = 1, 2, \ldots \tag{10}$$

where $\mathbb{Z}_+$ is the set of non-negative integers.

The set of feasible³ scheduling policies described in Equation (10) is denoted by $\Pi$. We
are interested in a subset of $\Pi$ that we introduce in the next section, namely the class of
Most Balancing (MB) policies. The main objective of this work is to prove the optimality of
MB policies among all policies in $\Pi$.



3 The Class of MB Policies ($\Pi^{MB}$)

In this section, we provide a description and mathematical characterization of the class of MB
policies. Intuitively, the MB policies attempt to balance the sizes (leftover) of the SS queues
as well as the RS queues in the system. This can be achieved by minimizing the queue length
differences for the two sets of queues, at every time slot t. We present next a more formal
characterization of MB policies. We first define the 'imbalance index' ($\kappa(\mathbf{x})$) of a vector $\mathbf{x}$.
Let $\mathbf{x} \in \mathbb{Z}_+^M$ be an M-dimensional vector. The imbalance index of $\mathbf{x}$ is defined as follows:

$$\kappa : \mathbb{Z}_+^M \to \mathbb{Z}_+, \qquad \kappa(\mathbf{x}) = \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} \left( x_{[i]} - x_{[j]} \right), \tag{11}$$

where $[k]$ denotes the index of the k-th longest component in the vector $\mathbf{x}$.

³We say that a policy $\pi$ is feasible if it selects a feasible scheduling control $\mathbf{U}(t) \in \mathcal{U}(\mathbf{S}(t))$ for all t.



The above definition ensures that the differences are nonnegative and that each pair of components
is accounted for in the summation only once. We define next the "balancing interchange" for
the vector $\mathbf{x}$. We use this operation in the proof of the optimality of MB policies.

Definition: Balancing Interchange: Given vectors $\mathbf{x}, \mathbf{x}^* \in \mathbb{Z}_+^M$, we say that $\mathbf{x}^*$ is obtained
from $\mathbf{x}$ by performing a balancing interchange if the two vectors differ in two components
$i > 0$ and $j > 0$ only, where

$$x^*_i = x_i - 1, \quad x^*_j = x_j + 1, \quad \text{s.t. } x_i \ge x_j + 1. \tag{12}$$

To put the above definition into perspective, if the vector x represents a queue sizes vector 
then a balancing interchange would involve the removal of one packet from a larger queue i 
and the insertion of that packet to a smaller queue j. We will show later (in Lemma [2]) that 
such an interchange will decrease the imbalance index of the vector. 
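As a small illustration of Equations (11) and (12), the following Python sketch computes the imbalance index and applies a balancing interchange. The function names are chosen here for illustration, positions are 0-based list indices, and the sketch does not enforce the restriction to non-dummy components.

```python
def imbalance_index(x):
    """Equation (11): sum of pairwise differences of the components sorted in descending order."""
    s = sorted(x, reverse=True)
    return sum(s[i] - s[j] for i in range(len(s)) for j in range(i + 1, len(s)))


def balancing_interchange(x, i, j):
    """Equation (12): move one packet from component i to component j, requiring x[i] >= x[j] + 1."""
    if x[i] < x[j] + 1:
        raise ValueError("component i must exceed component j by at least one packet")
    out = list(x)
    out[i] -= 1
    out[j] += 1
    return out
```

For `x = [5, 1, 3]` the index is 8; moving one packet from the first to the second component gives `[4, 2, 3]` with index 4. When the two components differ by exactly one packet, the interchange only permutes them and the index is unchanged.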

Given a state $\mathbf{s}(t)$ and a policy $\pi$ that chooses the feasible scheduling control $\mathbf{u}(t) \in \mathcal{U}(\mathbf{s}(t))$
at time slot t, define the "updated" queue sizes, $\hat{x}_i(t)$ and $\hat{y}_j(t)$, as the sizes of these queues
after applying the control $\mathbf{u}(t)$ and just before adding the exogenous arrivals during time slot
t. Note that because we let $A_0(t) = W^s_0(t)$, $\hat{x}_0(t)$ may be negative. The updated queue sizes
can be stated as follows:

$$\hat{x}_i(t) = x_i(t) - w^s_i(t), \quad i \in \mathcal{L}, \text{ and} \tag{13}$$
$$\hat{y}_j(t) = y_j(t) + v^r_j(t) - w^r_j(t), \quad j \in \mathcal{K}. \tag{14}$$

At any given time slot t, the imbalance indices for the updated SS and RS queue length
vectors $\hat{\mathbf{x}}(t)$ and $\hat{\mathbf{y}}(t)$ are given by $\kappa(\hat{\mathbf{x}}(t))$ and $\kappa(\hat{\mathbf{y}}(t))$. The $(L+1)$-st SS queue as well as the
$(K+1)$-st RS queue are the dummy queues defined in the previous section. From Equation (11),
it follows that the minimum possible value of the imbalance index for an $(M+1)$-dimensional
vector $\mathbf{x}$ is equal to $M \cdot x_{[M]}$, which is indicative of a fully balanced system.

We denote by $\Pi^{MB}$ the set of all MB policies in the system. We define the elements of
$\Pi^{MB}$ as follows:

Definition: Most Balancing Policies: A Most Balancing (MB) policy is a policy $\pi \in \Pi$
that, at every $t = 1, 2, \ldots$, chooses a feasible scheduling control vector $\mathbf{u}(t) \in \mathcal{U}(\mathbf{s}(t))$ such that
both imbalance indices $\kappa(\hat{\mathbf{x}}(t))$ and $\kappa(\hat{\mathbf{y}}(t))$ are minimized, i.e.,

$$\Pi^{MB} = \left\{ \pi \in \Pi : \arg\min \kappa(\hat{\mathbf{x}}(t)) \cap \arg\min \kappa(\hat{\mathbf{y}}(t)), \ \forall t \right\} \tag{15}$$



In Equation (15), two sets of policies are defined through the two argmin functions.
Policies in the first (respectively the second) set minimize the imbalance index for the SS
(respectively the RS) queue length vector. The intersection of the two sets results in a set
of policies that minimize the imbalance index for both vectors. We say that a policy has the
"MB property" during time slot n if it chooses a control that satisfies Equation (15) at t = n.
Then a MB policy can be defined as the policy that has the MB property at every time slot.

The set $\Pi^{MB}$ in (15) is well-defined and non-empty, since the minimization is over a finite
set of controls. Furthermore, the set of MB policies may have more than one element.

3.1 MB Policy Implementation

In this section, we provide a low-complexity heuristic algorithm (LCQ/SQ/LCQ) to implement
MB policies. This algorithm is defined next:

Definition: Algorithm LCQ/SQ/LCQ: For every time slot t, Algorithm LCQ/SQ/LCQ
selects the feasible vector $\mathbf{u}(t)$ such that $u_1(t)$ is the longest connected SS queue, $u_2(t)$ is the
shortest RS queue, and $u_3(t)$ is the longest connected RS queue. That is,

$$u_1(t) = l^s : \ l^s \in \arg\max_{i \in \mathcal{L} : c^s_i(t) = 1} x_i(t) \tag{16}$$
$$u_2(t) = s^r : \ s^r \in \arg\max_{i \in \mathcal{X}} c^r_i(t), \quad \mathcal{X} = \arg\min_{j \in \{1, \ldots, K\}} y_j(t) \tag{17}$$
$$u_3(t) = l^r : \ l^r \in \arg\max_{j \in \mathcal{K} : c^r_j(t) = 1} \left( y_j(t) + v^r_j(t) \right) \tag{18}$$

where $\mathbf{v}^r(t)$ is the RS insertion vector at time slot t. For SS queue 0 we add one extra
condition for the sake of mathematical accuracy, that is: "If $U_1(t) = 0$ then $U_2(t) = 0$." This
may happen when the controller is forced to idle during the first half of the frame. □



Equation (17) identifies the shortest RS queue; if more than one RS queue
satisfies this condition, and at least one of them is connected, then a connected one is
selected as $u_2(t)$. Otherwise, $u_2(t)$ will be a shortest non-connected RS queue. The reason
behind this extra condition is a special case: when all the RS queues have the same size,
RS queue $u_2(t)$ will be the longest RS queue after adding the packet transmitted from SS
queue $u_1(t)$. Selecting a connected RS queue in this case provides the opportunity for the
scheduler to select the longest RS queue as $u_3(t)$.
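The selection rule in Equations (16) - (18) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `lcq_sq_lcq` is hypothetical, ties are broken toward the lowest index, a real queue is only chosen when it is non-empty (otherwise the dummy queue 0 is used, i.e., the scheduler idles), and at least one real RS queue is assumed to exist.

```python
def lcq_sq_lcq(x, y, c_s, c_r):
    """Return a control (u1, u2, u3) following Equations (16) - (18).

    x, y: SS and RS queue lengths; c_s, c_r: 0/1 connectivities.
    Index 0 of every list is the dummy queue.
    """
    n_ss, n_rs = len(x), len(y)

    # Equation (16): longest connected SS queue; idle (queue 0) if none is usable.
    ss = [i for i in range(1, n_ss) if c_s[i] == 1 and x[i] > 0]
    u1 = max(ss, key=lambda i: x[i]) if ss else 0

    # Equation (17): shortest real RS queue, preferring a connected one among ties.
    if u1 == 0:
        u2 = 0                      # extra condition: if u1 = 0 then u2 = 0
    else:
        shortest = min(y[1:])
        candidates = [j for j in range(1, n_rs) if y[j] == shortest]
        u2 = max(candidates, key=lambda j: c_r[j])

    # Equation (18): longest connected RS queue after the first-half insertion.
    y_after = list(y)
    if u2 != 0:
        y_after[u2] += 1
    rs = [k for k in range(1, n_rs) if c_r[k] == 1 and y_after[k] > 0]
    u3 = max(rs, key=lambda k: y_after[k]) if rs else 0

    return u1, u2, u3
```

With `x = [0, 1]`, `y = [0, 0, 0]` and only RS queue 2 connected, the sketch routes the packet to RS queue 2 and forwards it in the same frame, which is the special case the tie-breaking rule in Equation (17) is meant to handle.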

Cellular networks, including 4G wireless networks, are mostly infrastructure-based networks.
Therefore, a centralized approach can be used for the implementation of the packet
scheduler (i.e., in the BS). Furthermore, in modern cellular networks a pilot channel is used to
estimate, among other things, the channel signal-to-noise ratio by measuring the received
signal power at the receiving end. In this case, the channel state information (CSI) as well
as the queue state information can be made available to the controller with minimal effort.

Lemma 1. Algorithm LCQ/SQ/LCQ results in a feasible control vector $\mathbf{u}(t)$ for any t.

Proof. According to Equations (16) - (18), packets are withdrawn from connected queues
only. Furthermore, packets are withdrawn from the longest connected queue for both the SS and
RS stacks of queues. This ensures that, as long as there is at least a single connected, non-empty
queue, the LCQ will not be empty. Therefore, Constraints (6) and (7) are satisfied.
Furthermore, Constraint (8) is satisfied by definition of the scheduler control $\mathbf{u}(t)$. □



The following theorem states that the policy resulting from the proposed implementation
algorithm is indeed a MB policy.

Theorem 1. For the operation of the system presented in Section 2 and shown in Figure 1,
a MB policy can be constructed using Algorithm LCQ/SQ/LCQ.

To prove Theorem 1 we need Lemma 2 below. It quantifies the effect of performing a
balancing interchange on the imbalance index $\kappa(\mathbf{x})$ of the $(L+1)$-dimensional vector $\mathbf{x}$. The
proof of the lemma is given in the appendix.



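Lemma 2. Let $\mathbf{x}$ and $\mathbf{x}^*$ be two $(L+1)$-dimensional ordered vectors (in descending order);
suppose that $\mathbf{x}^*$ is obtained from $\mathbf{x}$ by performing a balancing interchange of two components,
$l$ and $s$, of $\mathbf{x}$, where $x_l > x_s$, such that $s > l$; $x_l > x_a, \forall a > l$ and $x_s < x_b, \forall b < s$. Then

$$\kappa(\mathbf{x}^*) = \kappa(\mathbf{x}) - 2(s - l) \cdot \mathbb{1}_{\{x_l \ge x_s + 2\}} \tag{19}$$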
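Equation (19) can be sanity-checked numerically on small vectors. The sketch below is not a proof; it enumerates every descending-ordered vector of a given dimension with entries up to a chosen bound, applies the interchange to every pair of positions that meets the conditions of Lemma 2, and counts disagreements with the formula. The function names and the 0-based positions are choices made for this illustration.

```python
from itertools import product


def imbalance_index(x):
    """Equation (11): sum of pairwise differences of the sorted components."""
    s = sorted(x, reverse=True)
    return sum(s[i] - s[j] for i in range(len(s)) for j in range(i + 1, len(s)))


def check_lemma2(dim, max_value):
    """Return (cases_checked, violations) of Equation (19) over small ordered vectors."""
    checked = violations = 0
    for x in product(range(max_value + 1), repeat=dim):
        if list(x) != sorted(x, reverse=True):
            continue                                   # the lemma assumes descending order
        for l in range(dim):
            for s in range(l + 1, dim):
                if not x[l] > x[s]:
                    continue
                if any(x[a] >= x[l] for a in range(l + 1, dim)):
                    continue                           # need x_l > x_a for all a > l
                if any(x[b] <= x[s] for b in range(s)):
                    continue                           # need x_s < x_b for all b < s
                x_star = list(x)
                x_star[l] -= 1
                x_star[s] += 1
                indicator = 1 if x[l] >= x[s] + 2 else 0
                predicted = imbalance_index(x) - 2 * (s - l) * indicator
                checked += 1
                if imbalance_index(x_star) != predicted:
                    violations += 1
    return checked, violations
```

Running `check_lemma2(4, 4)` exercises all qualifying pairs in 4-dimensional vectors with entries between 0 and 4.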



3.1.1 Proof for Theorem 1

We prove Theorem 1 by contradiction. We assume that a MB policy selects a control $\mathbf{u}(t)$
at t that does not satisfy Equations (16) - (18). The control vector selected by Algorithm
LCQ/SQ/LCQ is feasible according to Lemma 1. Then, using Lemma 2, we show that applying
the controls selected by Equations (16) - (18) will result in imbalance indices $\kappa(\hat{\mathbf{x}}(t))$ and
$\kappa(\hat{\mathbf{y}}(t))$ that are smaller than those under the MB policy, which contradicts Equation (15).
Therefore, $\mathbf{u}(t)$ must satisfy Equations (16) - (18) and the theorem follows.



Proof for Theorem 1. Given the system state $\mathbf{s}(t) = (\mathbf{x}(t), \mathbf{y}(t), \mathbf{c}^s(t), \mathbf{c}^r(t))$ at time slot t, let
$l^s$ be the index of the longest connected SS queue (as in Equation (16)) and $s^r$ be the index of
the shortest RS queue before executing the control $\mathbf{u}(t)$ that satisfies Equation (17); let $l^r$ be
the index of the longest connected RS queue after executing the controls $u_1(t)$ and $u_2(t)$ and
just before executing the control $u_3(t)$ (as in Equation (18)). Let $\pi \in \Pi^{MB}$ be a MB policy
that selects the scheduler control $\mathbf{u}(t) \in \mathcal{U}(\mathbf{s}(t))$ during time slot t. To show a contradiction,
we assume (contrary to Theorem 1) that $\mathbf{u}(t)$ does not satisfy Equations (16) - (18).

We show next that in this case, the control vector selected by Algorithm LCQ/SQ/LCQ
during time slot t will result in an imbalance index that is either (i) less than or (ii) equal
to that obtained under a MB policy. Case (i) contradicts Equation (15); therefore, the MB
policy must satisfy Equations (16) - (18). Case (ii) ensures that LCQ/SQ/LCQ satisfies
Equation (15). In either case, Theorem 1 follows.



Consider the following three cases corresponding to Equations (16), (17) and (18): 



1) $x_{u_1(t)}(t) < x_{l^s}(t)$, i.e., $u_1(t)$ does not satisfy Equation (16) during time slot t. Then
$\hat{x}_{u_1(t)}(t) < \hat{x}_{l^s}(t) - 1$ (under $\pi$). According to Equation (12) we can perform a balancing
interchange between components $u_1(t)$ and $l^s$ that will reduce the imbalance index $\kappa(\hat{\mathbf{x}}(t))$.
Therefore, $\pi$ does not satisfy Equation (15) and hence it is not a MB policy. This contradicts
the original assumption that $\pi \in \Pi^{MB}$. Therefore, we conclude that a MB policy must satisfy
Equation (16). Note that $x_{u_1(t)}(t) > x_{l^s}(t)$ is not possible since queue $u_1(t)$ must be connected
(feasibility constraint (6)) and queue $l^s$ is the longest connected queue by assumption.



2) $y_{u_2(t)}(t) > y_{s^r}(t)$, i.e., $u_2(t)$ does not satisfy Equation (17) during time slot t. Then
$y_{u_2(t)}(t) + v^r_{u_2(t)}(t) > y_{s^r}(t) + v^r_{s^r}(t) + 1$. Similar to the previous case, we can perform a
balancing interchange between queues $u_2(t)$ and $s^r$. Again this will reduce the imbalance
index $\kappa(\mathbf{y}(t) + \mathbf{v}^r(t))$. Therefore, $\pi$ does not satisfy Equation (15) and hence it is not a MB
policy. This contradiction leads us to conclude that $\pi$ must satisfy Equation (17). Since $s^r$ is
the shortest queue by assumption, $y_{u_2(t)}(t) < y_{s^r}(t)$ is not possible. However, consider the case
$y_{u_2(t)}(t) = y_{s^r}(t)$ with $u_2(t) \ne s^r$. In this case, if $c^r_{u_2(t)}(t) = c^r_{s^r}(t)$ then $u_2(t)$ satisfies Equation (17) during
time slot t. Otherwise, i.e., $c^r_{s^r}(t) > c^r_{u_2(t)}(t) = 0$, we have $y_{u_2(t)}(t) + v^r_{u_2(t)}(t) = y_{s^r}(t) + 1$.

In this case, if $u_3(t) = s^r$ then $\hat{y}_{u_2(t)}(t) = \hat{y}_{s^r}(t) + 2$. A balancing interchange between queues
$u_2(t)$ and $s^r$ will reduce the imbalance index $\kappa(\hat{\mathbf{y}}(t))$. Therefore, $\pi$ does not satisfy Equation
(15) and hence it is not a MB policy. By contradiction, $\pi$ must satisfy Equation (17).

If on the other hand $u_3(t) \ne s^r$, then a policy that chooses either $u_2(t)$ or $s^r$ while keeping
$u_1(t)$ and $u_3(t)$ the same will result in the same imbalance index. Since $\pi \in \Pi^{MB}$ by
assumption, then LCQ/SQ/LCQ $\in \Pi^{MB}$ as well.



3) $y_{u_3(t)}(t) + v^r_{u_3(t)}(t) < y_{l^r}(t) + v^r_{l^r}(t)$, i.e., $u_3(t)$ does not satisfy Equation (18) during
time slot t. Then $\hat{y}_{u_3(t)}(t) < \hat{y}_{l^r}(t) - 1$. Again we can perform a balancing interchange
between queues $u_3(t)$ and $l^r$ that will result in a reduction of the imbalance index $\kappa(\hat{\mathbf{y}}(t))$.
Therefore, $\pi$ does not satisfy Equation (15) and hence it is not a MB policy. This contradiction
leads us to conclude that a MB policy $\pi$ must satisfy Equation (18). Since $l^r$ is
the longest connected queue by assumption and given the feasibility constraint (7), the case
where $y_{u_3(t)}(t) + v^r_{u_3(t)}(t) > y_{l^r}(t) + v^r_{l^r}(t)$ is not possible.

The above cases are the only possible cases. We conclude that a MB policy $\pi \in \Pi^{MB}$
satisfies Equations (16) - (18) and Theorem 1 follows. □



4 Optimality of MB Policies 



In this section, we provide a proof of the optimality of the Most Balancing (MB) policies.
We start by defining a partial order to facilitate the comparison of the cost functions under 
different policies. We also define the class of cost functions for the optimality problem that 
we investigate in this section. 

4.1 Definition of the Partial Order 

In order to prove the optimality of MB policies, we devise a methodology that enables comparison
of the queue lengths under different policies. The idea is to define an order that we
call the "preferred order" and use it to compare queue length vectors, for the SS queues as
well as the RS queues, under different policies. We start by defining the relation $\sqsubset$ on $\mathbb{Z}_+^M$
for some $M > 0$ as follows; we say that the two vectors $\tilde{\mathbf{x}}$ and $\mathbf{x}$ are related via $\tilde{\mathbf{x}} \sqsubset \mathbf{x}$ if:

S1- $\tilde{x}_i \le x_i$ for all i (i.e., pointwise comparison),

S2- $\tilde{\mathbf{x}}$ is a 2-component permutation of $\mathbf{x}$; the two vectors differ only in two components i
and j, such that $\tilde{x}_i = x_j$ and $\tilde{x}_j = x_i$, or

S3- $\tilde{\mathbf{x}}$ is obtained from $\mathbf{x}$ by performing a "balancing interchange" as in Equation (12).



Definition: The preferred order ($\preceq$) is defined as the transitive closure of the relation $\sqsubset$
on the set $\mathbb{Z}_+^M$, $M > 0$. □

The transitive closure of $\sqsubset$ on the set $\mathbb{Z}_+^M$ is the smallest transitive relation on $\mathbb{Z}_+^M$ that
contains the relation $\sqsubset$ [20]. Intuitively, $\tilde{\mathbf{x}} \preceq \mathbf{x}$ if the vector $\tilde{\mathbf{x}}$ is obtained from $\mathbf{x}$ by performing
a sequence of reductions, permutations of two components and/or balancing interchanges.
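The single-step relation defined by S1 - S3 is easy to test directly. The following Python sketch checks whether one vector is related to another by exactly one of the three rules; it does not compute the transitive closure, so it decides the one-step relation only, not the preferred order itself. The function name `related` is hypothetical and both vectors are assumed to have the same length.

```python
def related(x_tilde, x):
    """Return True if x_tilde is related to x by one application of S1, S2 or S3."""
    # S1: pointwise comparison.
    if all(a <= b for a, b in zip(x_tilde, x)):
        return True
    differing = [i for i in range(len(x)) if x_tilde[i] != x[i]]
    if len(differing) != 2:
        return False                      # S2 and S3 change exactly two components
    i, j = differing
    # S2: the two components are swapped.
    if x_tilde[i] == x[j] and x_tilde[j] == x[i]:
        return True
    # S3: one packet moved from a larger component a to a smaller component b, Equation (12).
    for a, b in ((i, j), (j, i)):
        if (x_tilde[a] == x[a] - 1 and x_tilde[b] == x[b] + 1
                and x[a] >= x[b] + 1):
            return True
    return False
```

For example, `[3, 2]` is related to `[4, 1]` through a balancing interchange, whereas `[5, 0]` is not related to `[4, 1]` because it moves a packet from the shorter queue to the longer one.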

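4.2 Definition of the Class of Cost Functions $\mathcal{F}$

We denote by $\mathcal{F}$ the class of real-valued functions on the set $\mathbb{Z}_+^M$ that are monotone and
non-decreasing with respect to the partial order $\preceq$. Given any two vectors $\tilde{\mathbf{x}}, \mathbf{x} \in \mathbb{Z}_+^M$, a
function $f \in \mathcal{F}$ if and only if

$$\tilde{\mathbf{x}} \preceq \mathbf{x} \ \Rightarrow \ f(\tilde{\mathbf{x}}) \le f(\mathbf{x}). \tag{20}$$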



Using (20) and the definition of preferred order, we conclude that the function /(x) = 
X1 + X2 + ■ — h xm belongs to J-". If x is a queue length vector, then this function corresponds 
to the total number of queued packets in the system. 
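
The relation and the cost function above can be sketched in a few lines of Python. This is only an illustration: it assumes that the balancing interchange of Equation (12) moves one packet from a queue $i$ to a queue $j$ with $x_i \ge x_j + 1$ (the form used in Appendix B), and all function names are hypothetical. The final loop checks, by brute force on small vectors, that the total queue length never increases along a single step S1, S2 or S3.

```python
from itertools import product

def is_s1(xt, x):
    """S1: pointwise comparison, xt[i] <= x[i] for all i."""
    return all(a <= b for a, b in zip(xt, x))

def is_s2(xt, x):
    """S2: xt equals x with exactly two components exchanged."""
    diff = [k for k in range(len(x)) if xt[k] != x[k]]
    if len(diff) != 2:
        return False
    i, j = diff
    return xt[i] == x[j] and xt[j] == x[i]

def is_s3(xt, x):
    """S3 (assumed form of Eq. (12)): one packet moved from queue i
    to queue j, where x[i] >= x[j] + 1; all other queues unchanged."""
    diff = [k for k in range(len(x)) if xt[k] != x[k]]
    if len(diff) != 2:
        return False
    for i, j in (diff, diff[::-1]):
        if xt[i] == x[i] - 1 and xt[j] == x[j] + 1 and x[i] >= x[j] + 1:
            return True
    return False

def related(xt, x):
    """True if xt and x are related by one of S1, S2 or S3."""
    return is_s1(xt, x) or is_s2(xt, x) or is_s3(xt, x)

def total(x):
    """Cost function f(x) = x_1 + ... + x_M (total queued packets)."""
    return sum(x)

def total_is_monotone(max_len=2, size=3):
    """Brute-force check that related(xt, x) implies total(xt) <= total(x)."""
    vectors = list(product(range(max_len + 1), repeat=size))
    return all(total(xt) <= total(x)
               for x in vectors for xt in vectors if related(xt, x))
```

Because the preferred order is the transitive closure of this relation, monotonicity along single steps extends to any chain of steps, which is why the total queue length belongs to the class of cost functions.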

4.3 The Optimality Results 

Let $B \le_{st} C$ denote the usual stochastic ordering for two real-valued random variables $B$ and
$C$ [9]. For the rest of this article, we say that a policy $\sigma \in \Pi$ 'dominates' another policy $\pi$ if

$$f(X^{\sigma}(t)) \le_{st} f(X^{\pi}(t)), \ \text{and} \ f(Y^{\sigma}(t)) \le_{st} f(Y^{\pi}(t)), \quad \forall t = 1, 2, \ldots, \tag{21}$$

for all cost functions $f \in \mathcal{F}$; where $X^{\sigma}$, respectively $Y^{\sigma}$, is the SS (respectively RS) queue
length vector under policy $\sigma$.

Note that from Equation (20) and the definition of stochastic ordering, $X^{\sigma}(t) \preceq X^{\pi}(t)$
and $Y^{\sigma}(t) \preceq Y^{\pi}(t)$, for all $t$ and all sample paths in a suitable sample space, is sufficient
for policy domination. The intended sample space is the standard one used in stochastic
coupling [10].

In what follows, let $X^{MB}$ and $X^{\pi}$ (respectively $Y^{MB}$ and $Y^{\pi}$) represent the SS queue sizes
(respectively RS queue sizes) under $\pi^{MB} \in \Pi^{MB}$ and an arbitrary policy $\pi \in \Pi$. To prove
the optimality of MB policies (i.e., Theorem 2), we need the following definitions and results.
Define the following subsets of the set $\Pi$ of all feasible scheduling policies: (a) $\Pi_\tau \subseteq \Pi$, the
set of policies that have the MB property during slots $t \le \tau$, and are arbitrary for $t > \tau$. (b)
$\Pi_\tau^{u_2} \subseteq \Pi$, the set of policies that have the MB property during time slots $t \le \tau - 1$ and during
$t = \tau$ choose the same controls $u_1(\tau)$ and $u_2(\tau)$ as those selected by a MB policy and an
arbitrary $u_3(\tau)$. Note that $u_3(\tau)$ may not be a MB control.

From the above definitions we have $\Pi = \Pi_0$. Note that the set $\Pi_n$ for any $\tau = n$ is not
empty, since MB policies are elements of it. For $n = 0, 1, \ldots$, the sets $\Pi_n$ form a monotone sequence
of subsets, such that $\Pi_n \subseteq \Pi_n^{u_2} \subseteq \Pi_{n-1}$. In light of the above, the set $\Pi^{MB}$ can be defined as
$\Pi^{MB} = \bigcap_{n=1}^{\infty} \Pi_n$. We will need the following lemmas to complete the proof of Theorem 2.

Lemma 3. Given $\pi \in \Pi_{\tau-1}$, a policy $\tilde{\pi} \in \Pi_\tau^{u_2}$ can be constructed, such that $\tilde{\pi}$ dominates $\pi$.

Lemma 4. Given $\pi \in \Pi_\tau^{u_2}$, a policy $\tilde{\pi} \in \Pi_\tau$ can be constructed, such that $\tilde{\pi}$ dominates $\pi$.



The above lemmas provide a methodology to construct a MB policy from any arbitrary
policy $\pi$ using stepwise improvements (i.e., by constructing policies that have the MB property
for one extra time slot at every subsequent step) on the original policy while maintaining
policy domination. The intermediate construction step (Lemma 3) is necessary in order to
simplify the coupling arguments used in the proof of the lemma.

We present a proof for Lemmas 3 and 4 in the appendix.

Theorem 2. Consider the system presented in Figure^ and described in Section^. A Most
Balancing policy $\pi^{MB} \in \Pi^{MB}$ dominates any arbitrary policy for this system operation, i.e.,

$$f(X^{MB}(t)) \le_{st} f(X^{\pi}(t)), \ \text{and} \tag{22}$$

$$f(Y^{MB}(t)) \le_{st} f(Y^{\pi}(t)), \quad \forall t = 1, 2, \ldots \tag{23}$$

for all $\pi \in \Pi$ and all cost functions $f \in \mathcal{F}$.

Proof. Starting from an arbitrary policy $\pi$, we apply a series of modifications to $\pi$, using
Lemmas 3 and 4, that result in a sequence of policies $(\pi_1, \pi_2, \ldots)$, such that: (i) policy $\pi_1$
dominates the original arbitrary policy $\pi$, (ii) $\pi_n \in \Pi_n$, in other words, policy $\pi_n$ has the MB
property during time slots $t = 1, 2, \ldots, n$, and, (iii) $\pi_m$ dominates $\pi_n$ for $m > n$ (i.e., when
$\pi_m$ has the MB property for a period of time $m - n$ slots longer than $\pi_n$).

By definition, $\pi$ is an arbitrary policy; therefore, $\pi \in \Pi_0$. We construct a policy $\tilde{\pi} \in \Pi_1^{u_2}$
that dominates $\pi$ according to Lemma 3. Using Lemma 4, we construct a second policy
$\pi_1 \in \Pi_1$ that has the MB property during time slot $t = 1$ and dominates $\tilde{\pi}$ (and therefore $\pi$). Repeating
the construction steps above and using Lemmas 3 and 4 again for time slots $t = 2, 3, \ldots$
will result in a sequence of policies $\pi_n \in \Pi_n$, $n = 2, 3, \ldots$ that satisfy (i) - (iii) above, i.e.,
each subsequently constructed policy has the MB property for one more time slot (than the
previous one) and dominates all the previous policies including the original policy $\pi$.

Denote the limiting policy for the sequence of constructed policies as $n \to \infty$ by $\pi^*$. In
that case, $\pi^* \in \Pi^{MB}$ since it has the MB property at all times. Furthermore, we can conclude
from the previous construction that $\pi^*$ dominates $\pi_n$, for all $n < \infty$, including the original
policy $\pi$. The theorem follows since the initial policy $\pi \in \Pi$ is assumed to be arbitrary. □



5 Conclusion 

In this work, we studied the wireless relay network optimization problem from a dynamic
packet scheduling perspective. We provided a queueing model for these networks that takes
into consideration the randomness of the wireless channel connectivity. We introduced a
class of packet scheduling policies, the most balancing (MB) policies. We proved, using
stochastic dominance and the coupling method, that MB policies dominate all other policies in
that they minimize, in the stochastic ordering sense, a class of cost functions of the system queue
lengths including the total number of packets in the system. We proposed an implementation
algorithm and proved that it will produce a MB policy for the proposed queueing system. The
results presented in this article provide a concrete understanding of the optimal scheduling
policy structure in homogeneous wireless relay networks and in multi-hop wireless networks in general.



Appendix A Proof for Lemma 2 in Section 3.1

In this section, we present the full proof for Lemma 2. This lemma quantifies the effect of
performing a balancing interchange on the imbalance index $\kappa(x)$ of a vector $x$.

Proof for Lemma 2. To prove this lemma, we first show that:

$$\sum_{i'=1}^{L} \sum_{j'=i'+1}^{L+1} (x^*_{i'} - x^*_{j'}) = \sum_{i=1}^{L} \sum_{j=i+1}^{L+1} (x_i - x_j) - 2(s - l) \cdot \mathbb{1}_{\{x_l \ge x_s + 2\}} \tag{A-1}$$

Then according to Equation (11), the above is equivalent to Equation (19) and Lemma 2
follows.

We generate the vector $x^*$ by performing a balancing interchange of two components, $l$
and $s$ (i.e., the $l$-th and the $s$-th largest components), in the vector $x$ and reordering the resulting
vector in descending order. The resulting vector $x^*$ is characterized by the following:

$$x^*_{k'} = \begin{cases} x_l - 1, & k = l, \\ x_s + 1, & k = s, \\ x_k, & \text{otherwise}, \end{cases} \tag{A-2}$$

where $k'$ is the new index (i.e., the order in the new vector $x^*$) of component $k$ in the
original vector $x$; in particular, $l'$ (respectively $s'$) is the new index of component
$l$ (respectively $s$).



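From Equation (A-2) we can identify $L - 2$ elements that have the same magnitude in
the two vectors $x$ and $x^*$. Therefore, the sum of differences between these $L - 2$ elements in
both vectors will also be the same, i.e.,

$$\sum_{\substack{i'=1 \\ i' \notin \{l', s'\}}}^{L} \; \sum_{\substack{j'=i'+1 \\ j' \notin \{l', s'\}}}^{L+1} (x^*_{i'} - x^*_{j'}) = \sum_{\substack{i=1 \\ i \notin \{l, s\}}}^{L} \; \sum_{\substack{j=i+1 \\ j \notin \{l, s\}}}^{L+1} (x_i - x_j) \tag{A-3}$$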



We calculate the sums for the remaining terms (i.e., when at least one of the indices $i, j$
belongs to $\{l, s\}$ and/or $i', j'$ belongs to $\{l', s'\}$) next. We first assume that $x_l \ge x_s + 2$; in this
case, we can easily show that $l' < s'$. Then, we have the following five, mutually exclusive,
cases to consider:

1. When $i' = l'$, $i = l$, $j' = s'$ and $j = s$. This case occurs only once, i.e., when decomposing
the double sum in Equation (A-1) we can find only one term that satisfies this case.
From Equation (A-2) we have: $x^*_{l'} - x^*_{s'} = x_l - x_s - 2$.

2. When $i' = l'$, $i = l$, $j' \ne s'$ and $j \ne s$. There are $L - l$ terms that satisfy this case.
Analogous to case 1) we determine that: $x^*_{l'} - x^*_{j'} = x_l - x_j - 1$.

3. When $i' \ne l'$, $i \ne l$, $j' = s'$ and $j = s$. There are $s - 2$ terms that satisfy this case. In
this case we can show that: $x^*_{i'} - x^*_{s'} = x_i - x_s - 1$.

4. When $i' \ne l', s'$, $i \ne l, s$, $j' = l'$ and $j = l$. There are $l - 1$ terms that satisfy this case.
In this case we can show that: $x^*_{i'} - x^*_{l'} = x_i - x_l + 1$.

5. When $i' = s'$, $i = s$, $j' \ne l', s'$ and $j \ne l, s$. There are $L - s + 1$ terms that satisfy this
case. In this case we have: $x^*_{s'} - x^*_{j'} = x_s - x_j + 1$.



The above cases cover all the terms in Equation (A-1) when $x_l \ge x_s + 2$. Combining all these
terms yields:

$$\begin{aligned} \sum_{i'=1}^{L} \sum_{j'=i'+1}^{L+1} (x^*_{i'} - x^*_{j'}) &= \sum_{i=1}^{L} \sum_{j=i+1}^{L+1} (x_i - x_j) - 2 \cdot (1) - 1 \cdot (L - l) - 1 \cdot (s - 2) + 1 \cdot (l - 1) + 1 \cdot (L - s + 1) \\ &= \sum_{i=1}^{L} \sum_{j=i+1}^{L+1} (x_i - x_j) - 2(s - l) \end{aligned} \tag{A-4}$$

Furthermore, if $x_l = x_s + 1$, then from Equation (A-2) it is clear that $x^*_{l'} = x_s$ and $x^*_{s'} = x_l$,
i.e., the resulting vector is a permutation of the original one. Therefore, the sum of differences
will be the same in both vectors and Equation (A-1) will be reduced to

$$\sum_{i'=1}^{L} \sum_{j'=i'+1}^{L+1} (x^*_{i'} - x^*_{j'}) = \sum_{i=1}^{L} \sum_{j=i+1}^{L+1} (x_i - x_j) \tag{A-5}$$

Equation (A-1) follows from Equations (A-4) and (A-5).



□ 
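
Identity (A-1) can also be checked numerically. The following Python sketch is a sanity check, not part of the proof. It assumes the imbalance index is the double sum of pairwise differences appearing in (A-1), evaluated on the vector sorted in descending order, and it adopts a tie convention that is an assumption of this sketch: when several components are equal, $l$ is taken as the last position of its run of equal values and $s$ as the first position of its run, so that the reordering step leaves the positions of all other components unchanged. Indices are 0-based in the code; only the difference $s - l$ enters the identity.

```python
from itertools import product

def imbalance(x):
    """Sum of x_[i] - x_[j] over i < j, with x sorted in descending order."""
    xs = sorted(x, reverse=True)
    n = len(xs)
    return sum(xs[i] - xs[j] for i in range(n) for j in range(i + 1, n))

def interchange(xs, l, s):
    """Move one packet from position l to position s, then re-sort."""
    out = list(xs)
    out[l] -= 1
    out[s] += 1
    return sorted(out, reverse=True)

def check_a1(size=4, max_len=4):
    """Verify (A-1) on all vectors in {0..max_len}^size.
    Returns the number of (x, l, s) triples that were checked."""
    verified = 0
    for x in product(range(max_len + 1), repeat=size):
        xs = sorted(x, reverse=True)
        for l in range(size - 1):
            for s in range(l + 1, size):
                if xs[l] < xs[s] + 1:
                    continue  # not a balancing interchange
                # Tie convention (assumption of this sketch): l is the last
                # index of its equal-valued run, s is the first of its run.
                if xs[l + 1] == xs[l] or xs[s - 1] == xs[s]:
                    continue
                indicator = 1 if xs[l] >= xs[s] + 2 else 0
                expected = imbalance(xs) - 2 * (s - l) * indicator
                assert imbalance(interchange(xs, l, s)) == expected
                verified += 1
    return verified
```

For instance, with x = [3, 3, 1], l = 1 and s = 2, the interchange gives [3, 2, 2] and the index drops from 4 to 2, i.e., by 2(s - l).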



Appendix B Proof for Lemmas 3 and 4

To prove Lemmas 3 and 4, we use stochastic coupling arguments. We start by introducing
the coupling method briefly.

In order to compare probability measures on a measurable space, it is often possible to
construct random elements on a common probability space with these measures as their
distributions, such that this comparison can be conducted in terms of these random elements
rather than the probability measures. Such a construction is often referred to as stochastic
coupling [10]. In the notation of [10], a formal definition of coupling of two probability
measures on the measurable space $(E, \mathcal{E})$ (the state space, e.g., $E = \mathbb{R}, \mathbb{R}^d$, etc.) is given
below.

A random element in $(E, \mathcal{E})$ is a quadruple $(\Omega, \mathscr{F}, P, X)$, where $(\Omega, \mathscr{F}, P)$ is the sample
space and $X$ is a measurable mapping from $\Omega$ to $E$ ($X$ is an $E$-valued random
variable, s.t. $X^{-1}(B) \in \mathscr{F}$ for all $B \in \mathcal{E}$).

Definition: A coupling of the two random elements $(\Omega, \mathscr{F}, P, X)$ and $(\Omega', \mathscr{F}', P', X')$ in
$(E, \mathcal{E})$ is a random element $(\hat{\Omega}, \hat{\mathscr{F}}, \hat{P}, (\hat{X}, \hat{X}'))$ in $(E^2, \mathcal{E}^2)$ such that

$$X \overset{D}{=} \hat{X} \quad \text{and} \quad X' \overset{D}{=} \hat{X}', \tag{B-1}$$

where $\overset{D}{=}$ denotes 'equal in distribution'.

Stochastic coupling was initially used by mathematicians to prove properties of stochastic
processes. Later on, coupling methods proved to be handy in proving optimality results for
dynamic control of queueing systems, cf. [21], [22], [11], [12] and many others.
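
As a toy illustration of the coupling idea (not of the construction used in this appendix), the Python sketch below couples two Bernoulli random variables through one common uniform draw. The parameters `p` and `q`, the sample size and the function names are all hypothetical. Each coordinate keeps its own marginal distribution, yet the pair is ordered on every sample path, which is exactly the kind of pathwise statement that implies stochastic ordering.

```python
import random

def coupled_bernoulli(p, q, rng):
    """Draw (X, X') with X ~ Bernoulli(p) and X' ~ Bernoulli(q)
    on a common probability space (one shared uniform draw)."""
    u = rng.random()
    return int(u < p), int(u < q)

def simulate(p=0.3, q=0.6, n=10000, seed=0):
    """Return (pathwise ordering holds, sample mean of X, sample mean of X')."""
    rng = random.Random(seed)
    samples = [coupled_bernoulli(p, q, rng) for _ in range(n)]
    pathwise = all(x <= y for x, y in samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    return pathwise, mean_x, mean_y
```

With p <= q, the event {u < p} is contained in {u < q}, so X <= X' holds for every draw; independent draws of X and X' would have the same marginals but no such pathwise ordering.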

In the proof of Lemmas 3 and 4, we apply the coupling method as follows: For the
scheduling policy $\pi$, let $\omega$ be a given sample path of the system state process. We construct
a new sample path $\tilde{\omega}$ and a new policy $\tilde{\pi}$. The details of this construction are given in
the proof below. To put things into perspective, in the coupling definition (Equation (B-1)),
$\hat{\omega} = (\omega, \tilde{\omega})$ and the "coupled" processes of interest in Equation (B-1) will be the SS queue sizes
$\hat{X} = \{X(n)\}$ and $\hat{X}' = \{\tilde{X}(n)\}$ as well as the RS queue sizes $\hat{Y} = \{Y(n)\}$ and $\hat{Y}' = \{\tilde{Y}(n)\}$.

The scheduling policy selects three control elements at every time slot, namely $u_1(t)$, $u_2(t)$
and $u_3(t)$. The detailed construction of policy $\tilde{\pi}$ is described in the proof below. Using
Equations ^ and ([s]), we can compute the new queue states $\tilde{x}(\cdot), \tilde{y}(\cdot)$ under $\tilde{\pi}$ and $x(\cdot), y(\cdot)$
under $\pi$. Our goal is to prove that the two relations

$$\tilde{x}(t) \preceq x(t) \tag{B-2}$$

$$\tilde{y}(t) \preceq y(t) \tag{B-3}$$

are satisfied for all $t$. This will ensure the dominance of policy $\tilde{\pi}$ over $\pi$. A queue length
vector $\tilde{x}$ is preferred over $x$ (i.e., $\tilde{x} \preceq x$) iff one of the statements S1, S2 or S3 (in Section
4.1) holds.



Proof for Lemma 3. To prove this lemma, we start from an arbitrary policy $\pi \in \Pi_{\tau-1}$ and a
sample path $\omega = (x(1), y(1), c^s(1), c^r(1), a(1), \ldots)$. The proof proceeds in two parts; in Part
1, we construct the sample path $\tilde{\omega}$ and the policy $\tilde{\pi}$ (as stated by Lemma 3) for times up to
$t = \tau$. In Part 2, we do the same for $t > \tau$.

Part 1: For time $t < \tau$ we construct $\tilde{\omega}$ to coincide with $\omega$, i.e., $\tilde{a}(t) = a(t)$, $\tilde{c}^s(t) = c^s(t)$
and $\tilde{c}^r(t) = c^r(t)$ for all $t < \tau$. We construct $\tilde{\pi}$ such that $\tilde{u}(t) = u(t)$ for all $t < \tau$. Then the
resulting queue lengths under both policies are the same, i.e., $\tilde{x}(\tau) = x(\tau)$ and $\tilde{y}(\tau) = y(\tau)$.

At $t = \tau$, let $\tilde{c}^s(\tau) = c^s(\tau)$, $\tilde{c}^r(\tau) = c^r(\tau)$ and $\tilde{a}(\tau) = a(\tau)$. We construct $\tilde{\pi}$ at $t = \tau$ by
selecting $\tilde{u}_1(\tau)$, $\tilde{u}_2(\tau)$ and $\tilde{u}_3(\tau)$ as follows:

1- Construction of $\tilde{u}_1(\tau)$. We have the following two cases to consider:

(i) The scheduling control $u(\tau)$ satisfies Equation (16) at $t = \tau$, i.e., $u_1(\tau) = l^s : l^s \in
\arg\max_{j \in \mathcal{L}: c^s_j(\tau) = 1} x_j(\tau)$. Then we set $\tilde{u}_1(\tau) = u_1(\tau)$. Note that $\tilde{u}_1(\tau)$ and $u_1(\tau)$ affect the SS
queue lengths only and have no effect on the RS queue sizes. It follows that the resulting SS
queue lengths satisfy $\tilde{x}(\tau + 1) = x(\tau + 1)$ ($\tilde{a}(\tau) = a(\tau)$ by construction), property (S1) holds true
for the SS queue length vector and (B-2) is satisfied at $t = \tau + 1$.



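(ii) The scheduling control $u(\tau)$ does not satisfy Equation (16) at $t = \tau$. Then we set
$\tilde{u}_1(\tau) = l^s : l^s \in \arg\max_{j \in \mathcal{L}: c^s_j(\tau) = 1} x_j(\tau)$. Keeping the construction of $\tilde{\omega}$ in mind, we conclude
the following (we suppress the time argument for the subscript to simplify notation):

$$\tilde{x}_{\tilde{u}_1}(\tau + 1) = x_{\tilde{u}_1}(\tau + 1) - 1, \quad \tilde{x}_{u_1}(\tau + 1) = x_{u_1}(\tau + 1) + 1, \quad \text{where } x_{\tilde{u}_1}(\tau) > x_{u_1}(\tau). \tag{B-4}$$

From Equation (B-4) and the construction of the exogenous arrivals we conclude that
property (S3) holds true for the SS queue length vector and (B-2) is satisfied at $t = \tau + 1$.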
2- Construction of $\tilde{u}_2(\tau)$ and $\tilde{u}_3(\tau)$. We have the following two cases to consider:

(i) The scheduling control $u(\tau)$ satisfies Equation (17) at $t = \tau$, i.e., $u_2(\tau) = s^r : s^r \in
\arg\max_{j \in \mathcal{I}} c^r_j(\tau)$, where $\mathcal{I} = \arg\min_{j \in \mathcal{K}} y_j(\tau)$. Then we set $\tilde{u}_2(\tau) = u_2(\tau)$ and $\tilde{u}_3(\tau) =
u_3(\tau)$. The resulting RS queue sizes satisfy $\tilde{y}(\tau + 1) = y(\tau + 1)$. Property (S1) holds true for the
RS queue length vector and (B-3) is satisfied at $t = \tau + 1$.

(ii) The scheduling control $u(\tau)$ does not satisfy Equation (17) at $t = \tau$. Then we set
$\tilde{u}_2(\tau) = s^r : s^r \in \arg\max_{j \in \mathcal{I}} c^r_j(\tau)$, where the set $\mathcal{I}$ is defined in case (i) above. We also set
$\tilde{u}_3(\tau) = u_3(\tau)$. In this case, the RS queue lengths satisfy (for all feasible selections of $u_3(\tau)$)
the following:

$$y_{u_2}(\tau + 1) = \tilde{y}_{u_2}(\tau + 1) + 1, \quad y_{\tilde{u}_2}(\tau + 1) = \tilde{y}_{\tilde{u}_2}(\tau + 1) - 1, \quad \text{where } y_{\tilde{u}_2}(\tau + 1) < y_{u_2}(\tau + 1). \tag{B-5}$$

Equation (B-5) suggests that $\tilde{y}(\tau + 1)$ is obtained from $y(\tau + 1)$ by performing a balancing
interchange of two components $\tilde{u}_2(\tau)$ and $u_2(\tau)$. In this case, property (S3) holds true for
the RS queue length vector and (B-3) is satisfied at $t = \tau + 1$.



In cases (1-) and (2-) above, we constructed the policy $\tilde{\pi}$ for time slot $t = \tau$. We also
showed that Relations (B-2) and (B-3) are satisfied for time slot $t = \tau + 1$. The above
concludes the construction of policy $\tilde{\pi}$ up to time slot $t = \tau$. Next (in Part 2), we will
construct $\tilde{\pi}$ for time slots $t > \tau$. Furthermore, starting from a preferred state at $t = \tau + 1$,
we will show using forward induction that relations (B-2) and (B-3) are satisfied for all time
slots $t > \tau$.
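
The selection rules quoted in Part 1 can be written out as a short Python sketch. It is an illustration under stated assumptions rather than the implementation algorithm referred to in the conclusion: the three functions mirror the argmax/argmin expressions quoted from Equations (16) - (18), idling is represented by `None` instead of the dummy queue 0, the additional feasibility requirement that the served queue be non-empty is not enforced, `v` is passed in as given, ties are broken by the lowest index, and all names are hypothetical.

```python
def mb_u1(x, c_s):
    """u_1: a longest connected SS queue (cf. Eq. (16)).
    Returns None (idle) if no SS queue is connected."""
    connected = [j for j in range(len(x)) if c_s[j] == 1]
    if not connected:
        return None
    return max(connected, key=lambda j: x[j])

def mb_u2(y, c_r):
    """u_2: among the shortest RS queues (the set I), prefer one that
    is connected, i.e., maximize c^r_j over I (cf. Eq. (17))."""
    shortest = min(y)
    candidates = [j for j in range(len(y)) if y[j] == shortest]
    return max(candidates, key=lambda j: c_r[j])

def mb_u3(y, v, c_r):
    """u_3: a connected RS queue with the largest y_j + v_j (cf. Eq. (18)).
    Returns None (idle) if no RS queue is connected."""
    connected = [j for j in range(len(y)) if c_r[j] == 1]
    if not connected:
        return None
    return max(connected, key=lambda j: y[j] + v[j])
```

For example, with SS queue lengths [3, 5, 4] and connectivity [1, 0, 1], `mb_u1` returns index 2: queue 1 is longer but disconnected.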

Part 2: We use induction to complete Part 2 of our proof. The sample path $\tilde{\omega}$ and the
policy $\tilde{\pi}$ are already defined for $t \le \tau$. To complete the induction argument, we assume that
$\tilde{\pi}$ and $\tilde{\omega}$ are defined up to time $n - 1 \ge \tau$ and that relations (B-2) and (B-3) are satisfied
at $t = n$, i.e., $\tilde{x}(n) \preceq x(n)$ and $\tilde{y}(n) \preceq y(n)$. We will show that at time slot $n$, $\tilde{\pi}$ can be
constructed such that relations (B-2) and (B-3) are satisfied at $t = n + 1$. To do that, we
have to show that either (S1), (S2) or (S3) holds for $\tilde{x}(t)$ and $\tilde{y}(t)$ at time slot $t = n + 1$.

We consider next three cases that correspond to properties (S1), (S2) and (S3) of the
vector $\tilde{x}(n)$. For each one of these cases, we consider three sub-cases that correspond to
properties (S1), (S2) and (S3) of the vector $\tilde{y}(n)$.

1- $\tilde{x}(n) \le x(n)$ (i.e., property (S1) holds for $\tilde{x}(t)$). We set $\tilde{a}(n) = a(n)$ and $\tilde{c}^s(n) = c^s(n)$.
We set $\tilde{u}_1(n) = u_1(n)$. The SS queue lengths satisfy (S1), i.e., $\tilde{x}(n + 1) \le x(n + 1)$, and (B-2)
holds at $t = n + 1$. The controls $\tilde{u}_2(n)$ and $\tilde{u}_3(n)$ are constructed below and the RS queue
lengths are computed as follows:

(a) $\tilde{y}(n) \le y(n)$ (i.e., property (S1) holds for $\tilde{y}(t)$). We set $\tilde{c}^r(n) = c^r(n)$. We also set the
controls $\tilde{u}_2(n) = u_2(n)$ and $\tilde{u}_3(n) = u_3(n)$. In this case, (S1) is satisfied again and Relation
(B-3) holds at $t = n + 1$.

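(b) $\tilde{y}(n)$ is a 2-component permutation of $y(n)$ (i.e., property (S2) holds for $\tilde{y}(t)$). Let
$i$ and $j$ be the indices of the two permuted RS queues. Then let $\tilde{c}^r_i(n) = c^r_j(n)$,
$\tilde{c}^r_j(n) = c^r_i(n)$ and $\tilde{c}^r_k(n) = c^r_k(n), \forall k \ne i, j$. We construct the controls $\tilde{u}_2(n)$ and $\tilde{u}_3(n)$ as
follows:

$$\tilde{u}_2(n) = \begin{cases} i & \text{if } u_2(n) = j \\ j & \text{if } u_2(n) = i \\ k & \text{if } u_2(n) = k, \forall k \ne i, j \end{cases} \qquad \tilde{u}_3(n) = \begin{cases} i & \text{if } u_3(n) = j \\ j & \text{if } u_3(n) = i \\ k & \text{if } u_3(n) = k, \forall k \ne i, j \end{cases} \tag{B-6}$$

From the construction of $\tilde{\pi}$, it can be easily shown that property (S2) is satisfied again
for RS queues at time $t = n + 1$ and (B-3) follows.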



(c) $\tilde{y}(n)$ is obtained from $y(n)$ by performing a balancing interchange as described by
Equation (12) (i.e., property (S3) holds for $\tilde{y}(t)$). Let $i$ and $j$ be the indices of the two RS
queues involved in the balancing interchange, such that $y_i(n) \ge y_j(n) + 1$. We consider the
following two cases:

(i) $y_i(n) = y_j(n) + 1$. Then $\tilde{y}_i(n) = y_j(n)$ and $\tilde{y}_j(n) = y_i(n)$. This case corresponds to
case (1b) above and the same construction of $\tilde{\omega}$ and $\tilde{\pi}$ applies. The resulting queue lengths at
$t = n + 1$ will satisfy property (S2) and (B-3) follows.

(ii) $y_i(n) > y_j(n) + 1$. We set $\tilde{c}^r(n) = c^r(n)$ and $\tilde{u}_2(n) = u_2(n)$.

If "$c^r_j(n) = 1$, $c^r_k(n) = 0, \forall k \ne j$ and $y_j(n) \le 0$"* (i.e., queue $j$ is the only connected RS
queue which happens to be empty), then $u_3(n) = 0$ (i.e., forced idling) according to feasibility
constraint (1). Then we set $\tilde{u}_3(n) = j$, which is a feasible control since $\tilde{y}_j(n) = y_j(n) + 1$
according to Equation (12). The resulting RS queue length vector in this case satisfies (S1),
i.e., $\tilde{y}(n + 1) \le y(n + 1)$, and (B-3) follows.

Else, i.e., $y_j(n) > 0$ and/or there are other connected queues in the stack, then we set
$\tilde{u}_3(n) = u_3(n)$. This action preserves property (S3) and Equation (B-3) holds at $t = n + 1$.

*Note that $j = 0$, the 'dummy' queue, is not excluded; hence, $y_j(n) \le 0$ is possible in this particular case.

This concludes the construction of $\tilde{\omega}$ and $\tilde{\pi}$ for case (1) during time slot $t = n$.
2- $\tilde{x}(n)$ is a 2-component permutation of $x(n)$ (i.e., property (S2) holds for $\tilde{x}(t)$).
Let $i$ and $j$ be the indices of the two permuted SS queues. Then let $\tilde{c}^s_i(n) = c^s_j(n)$,
$\tilde{c}^s_j(n) = c^s_i(n)$ and $\tilde{c}^s_k(n) = c^s_k(n), \forall k \ne i, j$. Similarly, $\tilde{a}_i(n) = a_j(n)$, $\tilde{a}_j(n) = a_i(n)$ and
$\tilde{a}_k(n) = a_k(n), \forall k \ne i, j$. We construct the control $\tilde{u}_1(n)$ as follows:

$$\tilde{u}_1(n) = \begin{cases} i & \text{if } u_1(n) = j \\ j & \text{if } u_1(n) = i \\ k & \text{if } u_1(n) = k, \forall k \ne i, j \end{cases} \tag{B-7}$$

From Equation (B-7), it can be easily shown that, at time $t = n + 1$, property (S2) holds
again for $\tilde{x}(n + 1)$ and $x(n + 1)$, and (B-2) is satisfied. Analogous to case (1-) above, we
consider three cases for the construction of $\tilde{u}_2(n)$ and $\tilde{u}_3(n)$ that correspond to (S1), (S2) and
(S3) properties of the vector $\tilde{y}(n)$. The construction of $\tilde{\omega}$ and $\tilde{\pi}$ in all three cases is analogous
to that presented in cases (1a), (1b) and (1c) respectively, and the resulting RS queue length
vector $\tilde{y}(n + 1)$ satisfies (B-3) at $t = n + 1$.
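
The permutation construction of case (2-) can be illustrated with a one-slot Python sketch. The queue update used here is an assumption made only for the illustration (serve the selected queue if it is connected and non-empty, then add the arrivals); the system's actual evolution equations are the ones referenced earlier in the paper. Idling is represented by `None`, and all function names are hypothetical. The sketch shows that exchanging the connectivities, arrivals and control of queues i and j, as in Equation (B-7), keeps the two queue length vectors 2-component permutations of each other after the slot.

```python
def step(x, c, a, u):
    """One slot of an assumed update: serve queue u if it is connected
    and non-empty, then add the arrivals a."""
    nxt = list(x)
    if u is not None and c[u] == 1 and nxt[u] > 0:
        nxt[u] -= 1
    return [q + arr for q, arr in zip(nxt, a)]

def swap(vec, i, j):
    """Return a copy of vec with components i and j exchanged."""
    out = list(vec)
    out[i], out[j] = out[j], out[i]
    return out

def swap_control(u, i, j):
    """Control mapping of Eq. (B-7): i <-> j, other choices unchanged."""
    if u == i:
        return j
    if u == j:
        return i
    return u

def permutation_preserved(x, c, a, u, i, j):
    """True if the permuted system stays a permutation of the original."""
    original = step(x, c, a, u)
    coupled = step(swap(x, i, j), swap(c, i, j), swap(a, i, j),
                   swap_control(u, i, j))
    return coupled == swap(original, i, j)
```

The argument relies on the permuted connectivity and arrival variables having the same joint distribution as the original ones, which is where the symmetry assumption on the arrival and connectivity processes enters.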

3- $\tilde{x}(n)$ is obtained from $x(n)$ by performing a balancing interchange as described by
Equation (12). Let $i$ and $j$ be the indices of the two SS queues involved in the balancing
interchange, such that $x_i(n) \ge x_j(n) + 1$. We consider the following two cases:

(i) $x_i(n) = x_j(n) + 1$. Then $\tilde{x}_i(n) = x_j(n)$ and $\tilde{x}_j(n) = x_i(n)$. This case corresponds to
case (2-) above and the same construction of $\tilde{\omega}$ and $\tilde{\pi}$ applies. The resulting queue lengths at
$t = n + 1$ will satisfy property (S2) and (B-2) follows.

(ii) $x_i(n) > x_j(n) + 1$. We set $\tilde{a}(n) = a(n)$ and $\tilde{c}^s(n) = c^s(n)$.

If "$c^s_j(n) = 1$, $c^s_k(n) = 0, \forall k \ne j$ and $x_j(n) \le 0$" (i.e., queue $j$ is the only connected SS
queue, which happens to be empty), then $u_1(n) = 0$ (i.e., forced idling) according to feasibility
constraint (1). Then we set $\tilde{u}_1(n) = j$. This is a feasible control since $\tilde{x}_j(n) = x_j(n) + 1$
according to Equation (12). The resulting SS queue length vector in this case satisfies (S1),
i.e., $\tilde{x}(n + 1) \le x(n + 1)$, and (B-2) is satisfied at $t = n + 1$.

Else, i.e., if $x_j(n) > 0$ and/or there are other connected queues in the stack, then we set
$\tilde{u}_1(n) = u_1(n)$. This action preserves property (S3) and Equation (B-2) is satisfied at $t = n + 1$.

In this case, as with the previous cases, there are three cases to consider for the construction
of $\tilde{u}_2(n)$ and $\tilde{u}_3(n)$, which correspond to (S1), (S2) and (S3) properties of the vector $\tilde{y}(n)$.
Again, the construction of $\tilde{\omega}$ and $\tilde{\pi}$ in these cases is analogous to that presented in cases (1a),
(1b) and (1c). The same conclusion regarding the resulting $\tilde{y}(n + 1)$ is drawn.

This concludes the construction of the policy $\tilde{\pi}$ at $t = n$, for $n > \tau$. We have shown that
this policy results in queue length vectors $\tilde{x}(n + 1)$ and $\tilde{y}(n + 1)$ that satisfy Equations
(B-2) and (B-3) respectively. Using forward induction we conclude that these equations are
satisfied for all $t$. Note that policy $\tilde{\pi} \in \Pi_\tau^{u_2}$ by construction in Part 1; its dominance over $\pi$
follows from relation (20). □

Proof for Lemma 4. The proof of this lemma is analogous to that of Lemma 3. The two
proofs differ in the first part, where the policy is constructed for $t \le \tau$. The second part of
the proof is the same and will not be repeated. We start from an arbitrary policy $\pi \in \Pi_\tau^{u_2}$
and a sample path $\omega = (x(1), y(1), c^s(1), c^r(1), a(1), \ldots)$. The proof proceeds in two parts;
in Part 1, we construct the sample path $\tilde{\omega}$ and the policy $\tilde{\pi}$ (as stated by Lemma 4) for $t \le \tau$.
In Part 2, we do the same for $t > \tau$.

Part 1: For time $t < \tau$ we construct $\tilde{\omega}$ to coincide with $\omega$, i.e., $\tilde{a}(t) = a(t)$, $\tilde{c}^s(t) = c^s(t)$
and $\tilde{c}^r(t) = c^r(t)$ for all $t < \tau$. We construct $\tilde{\pi}$ such that $\tilde{u}(t) = u(t)$ for all $t < \tau$. In this
case, the resulting queue lengths under both policies at $t = \tau$ are the same, i.e., $\tilde{x}(\tau) = x(\tau)$
and $\tilde{y}(\tau) = y(\tau)$.

At time slot $t = \tau$, let $\tilde{c}^s(\tau) = c^s(\tau)$, $\tilde{c}^r(\tau) = c^r(\tau)$ and $\tilde{a}(\tau) = a(\tau)$. Then for the
construction of $\tilde{\pi}$ at $t = \tau$, we set $\tilde{u}_1(\tau) = u_1(\tau)$ and $\tilde{u}_2(\tau) = u_2(\tau)$, where $u_1(\tau)$ and $u_2(\tau)$
are most balancing controls. Then property (S1) holds true for the SS queue length vector
and (B-2) is satisfied at $t = \tau + 1$. We construct $\tilde{u}_3(\tau)$ as follows:

(i) If the scheduling control $u(\tau)$ satisfies Equation (18) at $t = \tau$, i.e.,

$$u_3(\tau) = l^r : l^r \in \arg\max_{j \in \mathcal{K}: c^r_j(\tau) = 1} \left( y_j(\tau) + v_j(\tau) \right),$$

then we set $\tilde{u}_3(\tau) = u_3(\tau)$. Since $\tilde{u}_2(\tau) = u_2(\tau)$ by construction, it follows that the resulting
RS queue lengths satisfy $\tilde{y}(\tau + 1) = y(\tau + 1)$; property (S1) holds true for the RS queue length
vector and (B-3) is satisfied at $t = \tau + 1$.

(ii) The scheduling control $u(\tau)$ does not satisfy Equation (18) at $t = \tau$. Then we set

$$\tilde{u}_3(\tau) = l^r : l^r \in \arg\max_{j \in \mathcal{K}: c^r_j(\tau) = 1} \left( y_j(\tau) + v_j(\tau) \right).$$

The RS queue lengths in this case satisfy the following (we suppress the time argument for
the subscript to simplify notation):

$$\tilde{y}_{\tilde{u}_3}(\tau + 1) = y_{\tilde{u}_3}(\tau + 1) - 1, \quad \tilde{y}_{u_3}(\tau + 1) = y_{u_3}(\tau + 1) + 1, \quad \text{where } y_{\tilde{u}_3}(\tau + 1) > y_{u_3}(\tau + 1). \tag{B-8}$$

From Equation (B-8) we conclude that property (S3) holds true, i.e., $\tilde{y}(\tau + 1)$ is obtained
from $y(\tau + 1)$ by performing a balancing interchange of two components $\tilde{u}_3(\tau)$ and $u_3(\tau)$, and
(B-3) is satisfied at $t = \tau + 1$.



The above concludes the construction of policy $\tilde{\pi}$ for time slots $t \le \tau$. By construction,
$\tilde{\pi}$ has the MB property during time slot $\tau$. We showed that Relations (B-2) and (B-3) are
satisfied for time slot $t = \tau + 1$. Next (in Part 2), we will construct $\tilde{\pi}$ for time slots $t > \tau$.
Furthermore, starting from a preferred state at $t = \tau + 1$, we will show using forward induction
that Relations (B-2) and (B-3) are satisfied for all time slots $t > \tau$.

Part 2: We assume that $\tilde{\pi}$ and $\tilde{\omega}$ are defined up to time $n - 1 \ge \tau$ and that relations (B-2)
and (B-3) are satisfied at $t = n$, i.e., $\tilde{x}(n) \preceq x(n)$ and $\tilde{y}(n) \preceq y(n)$. We will show that at time
slot $n$, $\tilde{\pi}$ can be constructed such that relations (B-2) and (B-3) are satisfied at $t = n + 1$.

There are three cases to be considered. These cases correspond to properties (S1), (S2)
and (S3) of the vector $\tilde{x}(n)$. For each one of these cases, we consider three sub-cases that
correspond to properties (S1), (S2) and (S3) of the vector $\tilde{y}(n)$. The construction of $\tilde{\omega}$ and
$\tilde{\pi}$ for all these cases is the same as the construction carried out in Part 2 of the proof for
Lemma 3 and it will not be repeated here. The same conclusions are valid here, i.e., relations
(B-2) and (B-3) are satisfied at $t = n + 1$.

Part 2 above provides a complete description of the policy $\tilde{\pi}$ at $t = n$, for some $n > \tau$. This
policy results in queue length vectors $\tilde{x}(n + 1)$ and $\tilde{y}(n + 1)$ that satisfy Equations (B-2)
and (B-3) respectively. Using forward induction, we prove that these equations are satisfied
for all $t > \tau$. Parts 1 and 2 above construct the policy $\tilde{\pi}$ for all $t$ such that the preferred
order is preserved. Note that policy $\tilde{\pi} \in \Pi_\tau$ by construction in Part 1; its dominance over $\pi$
follows from relation (20). □
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